Abstract. A natural liberalization of Datalog is used in the
Distributed Knowledge Authorization Language (DKAL). We show
that the expressive power of this liberal Datalog is that of
existential fixed-point logic. The exposition is self-contained.



1. Prologue 

Existential fixed point logic (EFPL) differs from first-order logic by 
prohibiting universal quantification (while allowing existential quan- 
tification) and by allowing the "least fixed point" operator for positive 
inductive definitions. A precise definition is given below. 

Our original motivation for developing EFPL in [1] was its appro- 
priateness for formulating pre- and post-conditions in Hoare's logic of 
asserted programs [8]. In particular, the expressivity hypothesis needed 
for Cook's completeness theorem [1] in the context of first-order logic 
is automatically satisfied in the context of EFPL. 

But it turned out that EFPL has many other interesting properties. 

(1) EFPL captures polynomial time computability on the class of 
structures of the form {0, 1, . . . ,n} with (at least) the successor 
relation and names for the endpoints. 

(2) The set of logically valid EFPL formulas is a complete recur- 
sively enumerable set. 

(3) The set of satisfiable EFPL formulas is a complete recursively 
enumerable set. 

(4) The set of EFPL formulas that hold in all finite structures is a 
complete co-r.e. set. 

(5) When an EFPL formula is satisfied by a tuple of elements in a
structure, this fact depends on only a finite part of the structure.

(6) No transfinite iteration is needed when evaluating EFPL
formulas, by the natural iterative process, in any structure.^1

(7) EFPL can define (given appropriate syntactic apparatus) truth 
of EFPL formulas. 

(8) Truth of EFPL formulas is preserved by homomorphisms. 

(9) If an EFPL formula and a first-order formula are equivalent,
then they are equivalent to an existential^2 first-order formula.

Except for (7), which will be proved elsewhere, all these results are 
in [1]. The combination of (2) and (3) is surprising; it is possible 
because EFPL is not closed under negation. That is also why (7) 
doesn't contradict Tarski's theorem on undefinability of truth. 

Recently, EFPL has found a new application as the logical
underpinning of the distributed knowledge authorization language DKAL [6, 7].
For this application, it was useful to recast EFPL in a form that looks 
similar to Datalog; it was called liberal Datalog in [7]. The purpose of 
the present note is to show exactly how the logic programs of liberal 
Datalog correspond to the formulas of traditional EFPL. 

2. Introduction 

Quisani: Let's return to existential fixed-point logic. We discussed it 
once [2], yet something bothers me about the definition. 
Authors^3: Before we get to what's bothering you, let's be sure you
have the correct definition from [1]. 

Q: I think I know the definition all right, but to be safe let me check 
it with you: After making the convention that predicate symbols are 
classified as positive or negatable, one defines terms and atomic formulas 
just as in first-order logic. Compound formulas are built by 

• negation, applied only to atomic formulas whose predicate sym- 
bol is negatable, 

• conjunction and disjunction, 

• existential quantification, and 

• the LET-THEN construction. 



^1 This means that the closure ordinal of each of the iterations is at most $\omega$, the
first infinite ordinal. Contrast this with what happens when the least fixed point
operator is added to full first-order logic or just to its universal fragment. As shown
in the appendix to [1], arbitrarily large closure ordinals are possible there.

^2 We could nearly say "existential positive" here. Negative occurrences are needed
in the existential formula only for those predicate symbols that are negative in the
vocabulary of the EFPL formula.

^3 Not necessarily speaking in unison.



TWO FORMS OF ONE USEFUL LOGIC 






All but the last of these have their traditional meanings as in first-order
logic. The LET-THEN construction^4 produces formulas of the form

LET $P_1(\bar{x}^1) \leftarrow \delta_1, \ldots, P_k(\bar{x}^k) \leftarrow \delta_k$ THEN $\psi$

where the $P_i$'s are distinct, new, positive predicate symbols and the $\delta_i$
and $\psi$ are EFPL formulas in the vocabulary expanded by addition of
these $P_i$'s. Semantically, this formula means to use the $\delta_i$'s to define
a monotone operator on $k$-tuples of predicates; given a tuple of
(interpretations of) the $P_i$'s, see which tuples satisfy the $\delta_i$'s, and use
those sets of tuples as the new interpretation of the $P_i$'s. Repeat this
operation until you reach a fixed point. Finally, use this least fixed
point of the operator to interpret the $P_i$'s in $\psi$.
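
The operational reading just described can be sketched in Python for a finite structure. This is only an illustration under stated assumptions: the universe, the edge relation `E`, and the helper names `least_fixed_point` and `step` are ours, not notation from [1]. The sample formula is LET $T(x,y) \leftarrow E(x,y) \vee \exists z\,[T(x,z) \wedge E(z,y)]$ THEN $T(u,v)$.

```python
from typing import Callable, Dict, FrozenSet, Tuple

Relation = FrozenSet[Tuple[int, ...]]
Interp = Dict[str, Relation]


def least_fixed_point(step: Callable[[Interp], Interp], names) -> Interp:
    """Iterate `step` from empty relations until nothing changes.

    Terminates on a finite universe when `step` is monotone.
    """
    current: Interp = {name: frozenset() for name in names}
    while True:
        following = step(current)
        if following == current:
            return current
        current = following


# Hypothetical finite structure: universe {0,...,4} with one binary relation E.
UNIVERSE = range(5)
E = {(0, 1), (1, 2), (2, 3)}


def step(interp: Interp) -> Interp:
    """One application of the operator given by the body of the T-clause."""
    T = interp["T"]
    new_T = frozenset(
        (x, y)
        for x in UNIVERSE
        for y in UNIVERSE
        if (x, y) in E or any((x, z) in T and (z, y) in E for z in UNIVERSE)
    )
    return {"T": new_T}


closure = least_fixed_point(step, ["T"])["T"]
# The THEN part T(u, v) is evaluated against this fixed point.
print(sorted(closure))
```

Note that `step` replaces the old interpretation rather than adding to it; the iterates still grow because the body is monotone in `T`.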

A: That's right. You tacitly assumed a specific vocabulary T when 
you said that the $P_i$'s are new, meaning they're not in T. In other
words, although they occur in the LET-THEN formula you mentioned, 
they don't count as part of the vocabulary of that formula. 

Q: Right. I think of them as bound predicate variables. They could, 
for example, be renamed without affecting the meaning of the formula 
(as long as there are no clashes). 

A: Indeed, bound predicate variables are just what the $P_i$'s become
when the formula is translated into second-order logic (as in Theorem 5
of [1]). That reminds us of another comment about your description
of the semantics. You repeatedly applied the operator defined by the
$\delta_i$'s, until a fixed point is reached. But the semantics merely requires
the least fixed point; it doesn't care about its explicit construction.
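
As a hedged illustration of the second-order reading, here is the case of a single unary clause (a simplification of ours, not the general statement of Theorem 5 of [1]). It relies on two facts: the least fixed point of a monotone operator is also its least closed point, and $P$ occurs only positively in $\psi$.

```latex
\[
  \text{LET } P(x) \leftarrow \delta \ \text{ THEN } \psi
  \quad\equiv\quad
  \forall P\,\bigl[\,\forall x\,\bigl(\delta \rightarrow P(x)\bigr) \;\rightarrow\; \psi\,\bigr]
\]
```

The hypothesis $\forall x\,(\delta \rightarrow P(x))$ says that $P$ is closed under the operator defined by $\delta$; since $\psi$ is monotone in $P$, it holds at every closed point exactly when it holds at the least one.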

Q: OK. I guess I was giving a sort of operational semantics, whereas 
the logic is defined in a purely denotational way. 

But I've found it convenient to think about EFPL operationally, 
especially when trying to write EFPL formulas to express particular 
properties. For example, I once checked that, in the standard model of
arithmetic, $\mathfrak{N} = (\mathbb{N}, 0, 1, +, \cdot, <)$, the property of being a prime number
is expressible by an EFPL formula.

A: By Matiyasevich's solution of Hilbert's tenth problem, this property
(indeed any recursively enumerable property of natural numbers)
is expressible in $\mathfrak{N}$ by an existential first-order formula; you don't need
the fixed point operator.

^4 Other notations are "let ... in ..." and "letrec ... in ...". We retain "then"
mainly for consistency with [1]. "Then" also serves as a reminder that, when
expressed in second-order logic, the construction amounts to an implication: "If you
interpret the $P_i$'s in such a way that each $P_i$ is implied by the corresponding $\delta_i$,
then $\psi$ holds."

Q: I know, but I was looking for a formula that directly expresses
what it means to be prime, without detouring through clever
Diophantine tricks. The formula I constructed was actually fairly complicated,
mainly because of the need to simulate two bounded universal
quantifiers. A natural definition of "$x$ is prime" is that no $u < x$ divides $x$
unless $u = 1$, and a natural definition of "$u$ doesn't divide $x$" (as long
as $1 \neq u < x$) is that there is no $v < x$ such that $u \cdot v = x$. These are
the two bounded universal quantifications. In [1], you showed^5 how to
replace some cases of universal quantification with EFPL descriptions
of searches through the domains of quantification. For example, in the
case of arithmetic, $(\forall w < x)\,P(w)$ can be replaced with

LET $Y(u) \leftarrow u = 0 \vee \exists w\,[u = w + 1 \wedge Y(w) \wedge P(w)]$ THEN $Y(x)$.

I applied this idea, replacing each of the two universal quantifiers in 
the definition of "prime" by a search. Here's what I came up with, 
assuming that equality is negatable. 

LET $Y(u,v) \leftarrow$
    $u = 0 \vee \exists w\,[u = w + 1 \wedge Y(w,v) \wedge (w \cdot v \neq x \vee w = 1 \vee w = x)]$
THEN
LET $X(u,v) \leftarrow$
    $v = 0 \vee \exists w\,[v = w + 1 \wedge X(u,w) \wedge Y(u,w)]$
THEN $1 < x \wedge X(x,x)$.

A: That looks correct. The first LET-clause makes $Y(u,v)$ express "$x$
is not the product of anything $< u$ with $v$, except for trivial products
$1 \cdot x$ and $x \cdot 1$," and then the second LET-clause makes $X(u,v)$ express
"$x$ is not a nontrivial product of anything $< u$ and anything $< v$." So,
as you said, the two LET-clauses replace the two universal quantifiers
in the natural first-order definition of "prime." By the way, you don't
really need that equality is negatable, since you can replace $w \cdot v \neq x$
with $(w \cdot v < x) \vee (x < w \cdot v)$.

Q: That's right. And I don't really need $<$ since $a < b$ is equivalent to
$\exists y\,(a + y + 1 = b)$. But I wasn't trying to minimize the vocabulary; I
just wanted to make sure I see how to formalize things in EFPL. 

A: OK. Did you notice that, although your formulas define the desired
predicates on all of $\mathbb{N}$, you could have cut off the searches at $x$, for
example by adding a conjunct $w < x$ for each $\exists w$? The resulting finite
searches would still define "prime" correctly.

^5 See the proof of Theorem 3 in [1].

Q: I thought of that, but I decided the formula was long enough already. 
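
To make the two nested searches concrete, here is a minimal Python sketch that evaluates Quisani's formula by iterating each LET-clause to its fixed point. It assumes the cut-off variant just mentioned: all variables range over the finite domain {0, ..., x} instead of all natural numbers. The function name `is_prime_efpl` and the set-based representation of $Y$ and $X$ are ours.

```python
def is_prime_efpl(x: int) -> bool:
    """Evaluate the nested LET-THEN formula for "x is prime" on {0, ..., x}."""
    dom = range(x + 1)  # assumption: searches are cut off at x

    # First LET-clause: Y(u, v) <- u = 0 or
    #   exists w [u = w + 1 and Y(w, v) and (w*v != x or w = 1 or w = x)]
    Y = set()
    while True:
        new_Y = {
            (u, v)
            for u in dom
            for v in dom
            if u == 0
            or any(
                u == w + 1 and (w, v) in Y and (w * v != x or w == 1 or w == x)
                for w in dom
            )
        }
        if new_Y == Y:
            break
        Y = new_Y

    # Second LET-clause: X(u, v) <- v = 0 or
    #   exists w [v = w + 1 and X(u, w) and Y(u, w)]
    X = set()
    while True:
        new_X = {
            (u, v)
            for u in dom
            for v in dom
            if v == 0
            or any(v == w + 1 and (u, w) in X and (u, w) in Y for w in dom)
        }
        if new_X == X:
            break
        X = new_X

    # THEN-part: 1 < x and X(x, x)
    return 1 < x and (x, x) in X


primes_below_30 = [n for n in range(30) if is_prime_efpl(n)]
print(primes_below_30)
```

The sketch is deliberately naive (each stage recomputes the whole relation); it mirrors the semantics rather than aiming at efficiency.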

A: In any case, it's clear that you know what EFPL is; so what's the 
problem with the definition? 

Q: I thought I knew EFPL until I saw your extended abstract about 
the distributed knowledge authorization language, DKAL [7]. 

A: That abstract isn't by the two of us; it's by one of us and Itay 
Neeman. 

Q: I know, but Itay isn't here just now and you are, so I hope you can 
clarify the situation for me. 

A: We'll try. What exactly needs to be clarified? 

Q: Section 2 of the extended abstract claims to be about existential 
fixed-point logic (EFPL), but it looks quite different from the logic that 
I learned from your paper [1] and described to you here. In particular, 
that section of the abstract hardly mentions quantifiers at all and makes 
no distinction between existential and universal quantifiers, whereas 
that distinction was crucial in [1]. 

So I decided to look at the full tech report [6]. Its Section 2 is very 
similar to that of the extended abstract. Its Appendix A.3 contains a
quick, prose description of EFPL as defined in [1] but then ignores that 
and talks about logic programs and queries instead, just as Section 2 
did. 

As a result, I'm wondering about the connection between the "logic
programs plus queries" picture in [7, 6] and the traditional picture of
EFPL in [1].

A: The traditional picture in [1] corresponds exactly to the logic pro- 
grams aspect described in the DKAL paper [6]. The queries in the 
latter paper are outside EFPL, because they include universal quan- 
tification, at least in certain circumstances. 

When only relational structures are considered, so that logic
programs amount to Datalog, their equivalence with EFPL is in [3]. Grohe
mentioned in the introduction of [5] that it extends to the case of
vocabularies that include function symbols.

Q: Is the general case proved there? If not, can you show me in detail 
how logic programs correspond to traditional EFPL formulas? 

A: We don't recall seeing a published source for the details of the
correspondence in the general case. First, let's state the correspondence
precisely. We deal with structures X for a vocabulary in which all
predicate symbols are negatable. (So positive predicate symbols will
arise only as the P's in LET-clauses.)

Theorem 1. The relations definable, in a structure X , by EFPL for- 
mulas are the same as the superstrate relations obtained, over the sub- 
strate X , by logic programs. 

The proof is in two parts, namely translations in both directions be- 
tween the two formalisms. Furthermore, the translations are uniform; 
that is, they do not depend on the structure X. Incorporating this 
uniformity into the statement of the theorem, we have the following 
more complete formulation. 

Theorem 2. For every EFPL formula $\varphi$ with free variables among
$x_1, \ldots, x_n$, there is a logic program $\Pi$ with a distinguished $n$-ary
superstrate relation $P$ such that, in every structure X, the interpretation of $P$
defined by $\Pi$ consists of exactly the $n$-tuples that satisfy $\varphi$. Conversely,
given a logic program $\Pi$ and a distinguished superstrate predicate
symbol $P$, there is an EFPL formula $\varphi$ defining, in every structure X, the
set of $n$-tuples that $\Pi$ produces as the interpretation of $P$.

Q: You said "every structure X" but surely you intended some restric- 
tion on the vocabulary of X. 

A: You're right. The vocabulary of X should be the same as that of $\varphi$.
That's also the substrate vocabulary of $\Pi$, while the full vocabulary of
the program $\Pi$ includes, in addition, $P$ and (possibly) other superstrate
predicate symbols.

3. From Logic Programs to Formulas 

A: Let's begin by considering a logic program of the sort described in 
[6]. To recapitulate that description, we have a vocabulary T divided 
into a substrate part (which may contain relation and function symbols) 
and a superstrate part (containing only relation symbols). Substrate 
(resp. superstrate) formulas are those whose relation symbols are all 
in the substrate (resp. superstrate) part of T; note that a superstrate 
formula is allowed to contain substrate function symbols. 

Q: What's the intuition behind substrate and superstrate? 

A: The idea is that the substrate relations are given to us and the 
superstrate relations are computed by means of the program. That 
intuition is reflected in the semantics, which we'll review in a moment, 
but first let's finish the description of the syntax of logic programs.

A logic rule in the vocabulary T has the form $H \leftarrow B$, where the head
$H$ is an atomic superstrate formula and the body $B$ is a conjunction
of atomic superstrate formulas and possibly a quantifier-free substrate
formula. A logic program is a finite set of logic rules.

Semantically, a logic program is to be interpreted in a given structure
X for the substrate vocabulary. It defines interpretations for the
superstrate relations as the least fixed point of all the rules in the program.
That is, interpret each rule

$R(t_1, \ldots, t_n) \leftarrow B$

in the program as an instruction to increase the current interpretation
of $R$ by adding all those tuples $(a_1, \ldots, a_n)$ of elements of X such that
some assignment of values (in X) to the variables makes $B$ true and
gives each $t_i$ the value $a_i$. Formally, this amounts to an operator $\Gamma$ on
tuples of relations regarded as interpreting all the superstrate relations
(or, equivalently, on T-structures whose reduct to the substrate is X).
Starting with empty superstrate relations, repeatedly apply this operator
until a fixed point is reached. The
desired interpretations of the superstrate relations constitute the least
(with respect to componentwise inclusion) fixed point of $\Gamma$.
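
The semantics above can be sketched in Python for a finite substrate. Everything concrete here is an assumption made for illustration: the substrate (integers mod 6 with a constant 0 and a unary function `s`), the two-rule program, and the encoding of rules as tuples of Python callables. The function `evaluate` is a hypothetical helper, not part of DKAL.

```python
from itertools import product


def evaluate(universe, rules, relation_names):
    """Least fixed point of a rule set over a finite substrate.

    Each rule is (relation, head_terms, variables, body): every head term
    maps an assignment to an element, and body(assignment, interp) is a
    truth value in which superstrate relations occur only positively.
    """
    interp = {name: set() for name in relation_names}
    changed = True
    while changed:
        changed = False
        for relation, head_terms, variables, body in rules:
            for values in product(universe, repeat=len(variables)):
                assignment = dict(zip(variables, values))
                if body(assignment, interp):
                    new_tuple = tuple(term(assignment) for term in head_terms)
                    if new_tuple not in interp[relation]:
                        # Tuples are only ever added (inflationary reading).
                        interp[relation].add(new_tuple)
                        changed = True
    return interp


# Hypothetical substrate: integers mod 6 with constant 0 and function s.
UNIVERSE = range(6)


def s(a):
    return (a + 2) % 6


RULES = [
    # Reach(0) <- (empty body)
    ("Reach", (lambda env: 0,), (), lambda env, interp: True),
    # Reach(s(x)) <- Reach(x)
    ("Reach", (lambda env: s(env["x"]),), ("x",),
     lambda env, interp: (env["x"],) in interp["Reach"]),
]

result = evaluate(UNIVERSE, RULES, ["Reach"])
print(sorted(result["Reach"]))
```

The loop updates relations in place within a round instead of applying $\Gamma$ to a frozen snapshot; for monotone rule bodies both strategies stop at the same least fixed point.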

Q: This definition reminds me of something else that I wanted to ask 
you. In [7], you called this language "liberal Datalog," but, since you
allow function symbols, it looks to me like pure Prolog. Isn't the pres- 
ence or absence of function symbols the essential difference between 
Prolog and (constraint) Datalog? 

A: The intended semantics of Prolog uses an Herbrand universe, which 
means a structure where every element is denoted by a unique ground 
term. The substrate structures of liberal Datalog are quite arbitrary. In 
particular, the functions of the structure need not be free constructors. 

Q: So liberal Datalog is liberal even when compared to pure Prolog. 

A: That's right. 

Now let's see that the superstrate relations produced, over a
substrate X, by a liberal Datalog program can be defined in X by EFPL
formulas in the sense of [1]. In fact, we'll obtain the required formulas
in a simple, explicit manner from the given program $\Pi$.

As a first step, we can rewrite $\Pi$ so that the head of each rule involves
no function symbols, i.e., each head looks like $R(x_1, \ldots, x_n)$ where the
$x_i$ are distinct variables.

Q: This was already explained in [6] in the context of an example, but
the method is clearly general. Given a rule of the form $R(t_1, \ldots, t_n) \leftarrow B$,
where the $t_i$ are terms that need not be variables, replace it with

$R(x_1, \ldots, x_n) \leftarrow B \wedge \bigwedge_{i=1}^{n} (x_i = t_i)$

where the $x_i$ are distinct, fresh variables. This modification of $\Pi$ has
no effect on the operator $\Gamma$ that it defines, so the superstrate relations
are unchanged.

A: Right. When making these modifications to $\Pi$, you can also
arrange that, if the same relation symbol $R$ occurs in the head of several
rules, then the same tuple of variables $x_1, \ldots, x_n$ is used for all these
occurrences.

At this point, we'll gradually move from the syntax of logic rules 
to the syntax of EFPL. Specifically, we'll modify the rules of our pro- 
gram some more, and the resulting bodies will no longer have the form 
required in logic rules but rather will be EFPL formulas. 

If several rules in the program begin with the same superstrate
symbol $R$ and therefore, by the preceding normalization, have the same
head $H$, then we combine these rules $H \leftarrow B_1, \ldots, H \leftarrow B_k$ into a
single rule

$H \leftarrow (B_1 \vee \cdots \vee B_k)$.

Q: This use of disjunction was allowed in [6, Appendix A.2.2].

A: Yes, but there it was regarded as syntactic sugar, a mere
abbreviation of the $k$ separate rules $H \leftarrow B_i$. Now, we want to regard it as a
single rule in its own right.

Next, if the body of a rule contains variables other than those in the
head, quantify them existentially. That is, replace $R(x_1, \ldots, x_n) \leftarrow B$
with

$R(x_1, \ldots, x_n) \leftarrow (\exists y_1) \cdots (\exists y_r)\, B$

where $y_1, \ldots, y_r$ are all the variables in $B$ other than $x_1, \ldots, x_n$. Is it
clear that this change doesn't affect the superstrate relations?

Q: Yes. In fact, it doesn't change the operator $\Gamma$ used to define those
relations. The essential point is that the definition of $\Gamma$ was already in
terms of "some assignment."

A: Good. Notice also that the bodies of our rules, after these modifi- 
cations, are still EFPL formulas. 

Q: Sure. In fact, they're in the existential fragment of first-order logic.

A: Right. Let $\Pi'$ be the current, modified program^6, let $R$ be a
superstrate relation, say $n$-ary, and consider the EFPL formula $\varphi(x_1, \ldots, x_n)$
given by

LET $\Pi'$ THEN $R(x_1, \ldots, x_n)$.

Q: Are the variables $x_i$ after THEN the same ones that were used with
all occurrences of $R$ in head position in the program?

A: They might as well be, but it doesn't matter. All variables in the
$\Pi'$ part of $\varphi$ are bound, either by the quantifiers that we explicitly
introduced into the bodies of rules or by the "LET . . . THEN"
construction. So it doesn't matter which variables they are. The only
reason we insisted on having the same variables for all occurrences, in
heads, of the same relation symbol is to be able to combine those rules
into a single rule. Thus in $\Pi'$ each superstrate relation symbol occurs
in the head of exactly one rule, as required by the syntax of EFPL.
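
A small worked instance of the steps so far may help. The program below is hypothetical (a superstrate symbol $\mathit{Reach}$ over a substrate with a constant $0$ and a unary function $s$); each line applies one of the rewriting steps described above.

```latex
\begin{align*}
\Pi:\quad
  & \mathit{Reach}(0) \leftarrow \text{(empty body)}, \qquad
    \mathit{Reach}(s(y)) \leftarrow \mathit{Reach}(y) \\
\text{heads normalized:}\quad
  & \mathit{Reach}(x) \leftarrow x = 0, \qquad
    \mathit{Reach}(x) \leftarrow \mathit{Reach}(y) \wedge x = s(y) \\
\text{rules combined:}\quad
  & \mathit{Reach}(x) \leftarrow (x = 0) \vee \bigl(\mathit{Reach}(y) \wedge x = s(y)\bigr) \\
\text{body quantified } (\Pi'):\quad
  & \mathit{Reach}(x) \leftarrow (\exists y)\,\bigl[(x = 0) \vee \bigl(\mathit{Reach}(y) \wedge x = s(y)\bigr)\bigr] \\
\varphi(x):\quad
  & \text{LET } \Pi' \text{ THEN } \mathit{Reach}(x)
\end{align*}
```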

Q: Wait a minute. I see why each superstrate relation symbol occurs
in the head of at most one rule in $\Pi'$; if it was originally in more than
one, then you combined those rules using disjunction. But why is it in
exactly one? What if some superstrate symbol doesn't occur at all?

A: We ignored that situation because such a symbol would be
interpreted as the empty relation over any substrate, so there's really no
point in including it in the superstrate. But, to be accurate, we should
cover this case as well, and the disjunction idea still works. So if the
original program had no rules starting with the superstrate symbol $R$,
then $\Pi'$ would have one such rule, $R(x_1, \ldots, x_n) \leftarrow B$, where $B$ is the
disjunction of no formulas (the bodies of all the rules with $R$ in the
head). Since the disjunction of no formulas is, by the only reasonable
convention, false, we get the rule $R(x_1, \ldots, x_n) \leftarrow \text{false}$, which has
the right semantical effect.

Q: That's a pretty pedantic answer. 

A: It was a pretty pedantic question. 

^6 To agree exactly with the syntax of [1], $\Pi'$ should be regarded as a sequence of
rules, rather than a set, by ordering its rules arbitrarily. This pedantry is required
because in [1] we defined the fixed-point construction "LET . . . THEN ..." using
sequences between the LET and the THEN. Sequences have the advantage that
formulas are strings of symbols; sets would have the advantage of mathematical
elegance, since the ordering in the sequence never matters. The same comments
apply to logic programs. It is curious that the directly writable, sequence convention
is used in the mathematically oriented paper [1] while the more elegant, abstract,
set convention is used in the application-oriented paper [6].

Coming back to the EFPL formula defined above, is it clear that
the relation it defines is exactly the interpretation of the superstrate
relation $R$ that is produced by the original logic program $\Pi$?

Q: Almost. It's clear that, if we interpreted $\Pi'$, where it occurs in
$\varphi$, in the same way that logic programs are interpreted, then it would
produce the same superstrate relations as $\Pi$. But the interpretation
of rules, and specifically the operator $\Gamma$, is not quite the same in logic
programs as in EFPL. In the semantics of logic programs, the set of
tuples described by a rule is added to the current interpretation of the
relevant superstrate relation symbol. In the semantics of EFPL, the
same set of tuples alone constitutes the new interpretation of that
symbol. In other words, the $\Gamma$ operator for logic programs [6] is explicitly
designed to be inflationary; that of EFPL [1] need not be inflationary.

A: That is true, but it doesn't matter. A not-necessarily-inflationary
operator $\Delta$ and its explicitly inflationary variant $\Gamma$ defined by
$\Gamma(A) = \Delta(A) \cup A$ have the same closed points.

Q: What do you mean by closed points? 

A: A closed point of $\Gamma$ is an $A$ such that $\Gamma(A) \subseteq A$. It's fairly common
terminology to say that a set is closed under an operator. We say
"closed point" rather than "closed set" because, when there are several
superstrate predicate symbols, our operators act on tuples of relations,
not on single sets.

Coming back to the situation of an operator $\Delta$ and its inflationary
variant $\Gamma$, it's clear that they have the same closed points, since
$\Delta(A) \cup A \subseteq A$ holds exactly when $\Delta(A) \subseteq A$. For any
monotone operator, the least fixed point is also the least closed point.
And our operators are monotone, because superstrate relations occur
only positively in the bodies of logic programs (even after we modify
the programs as above). So $\Gamma$ and $\Delta$ have the same least fixed point.

Q: OK. Actually, I now see another reason why we can ignore the
inflationary aspect of the $\Gamma$ in [6]. If we think of the least fixed point of
a monotone operator $\Delta$ as being constructed by iterating $\Delta$, then the
sequence of iterates is non-decreasing. So, for every $A$ in this sequence,
$\Delta(A) = \Gamma(A)$.
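
Both observations can be checked on a small example. The sketch below uses a hypothetical monotone operator `delta` on subsets of {0, ..., 5} (the set containing 0 plus the successors of the input along an assumed edge relation) and its inflationary variant `gamma`; the names and the example are ours.

```python
from itertools import combinations

UNIVERSE = range(6)
EDGES = {(0, 1), (1, 2), (2, 3), (4, 5)}  # assumed example relation


def delta(A):
    """Monotone but not inflationary: 0 together with the successors of A."""
    return frozenset({0} | {y for (x, y) in EDGES if x in A})


def gamma(A):
    """Inflationary variant: Gamma(A) = Delta(A) union A."""
    return delta(A) | frozenset(A)


def stages(operator):
    """Iterates of `operator` from the empty set up to the fixed point."""
    sequence = [frozenset()]
    while True:
        following = operator(sequence[-1])
        if following == sequence[-1]:
            return sequence
        sequence.append(following)


delta_stages = stages(delta)
gamma_stages = stages(gamma)

# Closed points coincide on every subset of the universe.
all_subsets = [
    frozenset(c)
    for r in range(len(UNIVERSE) + 1)
    for c in combinations(UNIVERSE, r)
]
same_closed_points = all((delta(A) <= A) == (gamma(A) <= A) for A in all_subsets)

print(delta_stages == gamma_stages, same_closed_points)
```

Away from the iterates the two operators can differ, for example on the set {5}, which is why the agreement along the iteration is worth stating separately.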

A: Right. So this completes the proof that the superstrate relations
defined, over a given substrate structure X, by a logic program $\Pi$, as
in [6], are also defined over X by EFPL formulas $\varphi$. Furthermore, the
transformation of $\Pi$ into $\varphi$ is uniform, i.e., the same for all X.

Q: Yes. In fact it proves a bit more, namely that any logic program
can be translated into EFPL formulas of a rather special form: A
single LET . . . THEN construct, where the part after THEN is an
atomic formula consisting of a relation symbol followed by variables.
Furthermore, the bodies of the inductive definitions (rules) after LET
are existential first-order formulas.

That worries me, because for the other half of the equivalence be- 
tween logic programs and EFPL, you'll have to find logic programs 
equivalent to arbitrary EFPL formulas, not just those of this special 
form. 

A: That's right, but you needn't worry. Every EFPL formula is equiv- 
alent to one in this special form, and the equivalence to logic programs 
is one way to prove this. 

4. From Formulas to Logic Programs 

A: Given an arbitrary EFPL formula $\varphi(u_1, \ldots, u_n)$ with free variables
among those indicated, we shall transform it into a logic program $\Pi$
such that a particular $n$-ary superstrate relation, as defined by $\Pi$ over
any substrate X, is the set of $n$-tuples that satisfy $\varphi$ in X. That
will complete the proof that logic programs and EFPL formulas are
equivalent; they can be regarded as two ways of presenting the same
logic.

As a first step, we'll show that every EFPL formula is logically
equivalent to one of the special form

LET $P_1(\bar{x}^1) \leftarrow \delta_1, \ldots, P_k(\bar{x}^k) \leftarrow \delta_k$ THEN $\psi$

where all the formulas $\delta_i$ and $\psi$ are existential first-order formulas and
where the free variables of any $\delta_i$ are among the variables $\bar{x}^i$ serving as
the arguments of the corresponding $P_i$ in the definition $P_i(\bar{x}^i) \leftarrow \delta_i$.

Q: Since "LET . . . THEN" binds these $x$'s, the only free variables in
such a formula are those in $\psi$.

A: Right; that will simplify part of the proof. We should also mention
that $k = 0$ is allowed; then the formula above amounts to just $\psi$.

Q: I bet your proof that all EFPL formulas are equivalent to ones of this
special form is an induction on formulas, and by allowing $k = 0$ you've
made the cases of atomic formulas and of negated atomic formulas
trivial (whereas otherwise they would only have been obvious).

A: Right on both counts. Conjunction and disjunction are also easy.
Given two formulas in the desired form, rename the bound predicate
variables of the LET-clause (the $P_i$'s in the notation above) in one
of them so as to be distinct from those of the other formula. Then just
combine their LET-clauses and form the conjunction or disjunction of
the THEN-clauses.

Existential quantification is even easier. Leave the LET-clause alone 
and quantify the THEN-clause. 

Q: This simple argument for the existential quantifier case makes use
of your convention that the only free variables in $\delta_i$ are among the $\bar{x}^i$.
Without this convention, you'd have to consider the possibility that
some other variable free in some $\delta_i$ is being quantified.

But, in keeping with the principle that there's no free lunch, it seems 
that you'll have to pay for this convention in the one remaining case of 
your induction. Given a LET-THEN formula, you'll have to get rid of 
any extraneous free variables in its LET-clause. 

A: You're right, but in this case the lunch is fairly cheap. 

Suppose we're given a formula LET $P(x) \leftarrow \delta$ THEN $\psi$ where both
$\delta$ and $\psi$ are of the desired form.

Q: Wait a minute. Are you assuming that there's only one constituent
$P(x) \leftarrow \delta$ in the LET part and that $P$ is unary?

A: Yes, but this is only for notational simplicity.^7 The general case
would involve a lot of subscripts but no new ideas. Notice, in particular,
that if we have several $P$'s with their corresponding $\delta$'s, and if each $\delta$
has a LET-clause with several constituents, defining predicates $Q$, then
each of those $Q$'s needs two subscripts (the first to tell which $\delta$ it's
in and the second to tell where it is in that $\delta$'s LET-clause), and the
range of the second subscript depends on the first. Subscript-juggling
in such a case tends to obscure the proof.

Q: OK, go ahead with your "one unary predicate" proof. Maybe af- 
terward I'll figure out all the subscripts for the general case on my 
own. 

A: Let's start by taking our formula,

LET $P(x) \leftarrow \delta$ THEN $\psi$,

and showing how to eliminate any extraneous free variables from $\delta$.
Continuing to avoid uninformative subscripts, let's suppose $y$ is the
only variable other than $x$ that is free in $\delta$. Then we claim our formula
is equivalent to

LET $P'(x, y) \leftarrow \delta'$ THEN $\psi'$,

where $\delta'$ and $\psi'$ are obtained from $\delta$ and $\psi$ by replacing each atomic
subformula of the form $P(t)$ with $P'(t, y)$. (Of course, we assume that
bound variables in $\delta$ and $\psi$ have been renamed if necessary so that the
$y$'s introduced here don't become accidentally bound.)

^7 It is also known that, at least in the presence of two constant symbols,
simultaneous positive recursions can be reduced to a single positive recursion, albeit for
a relation of higher arity. See [5, Theorem 1C.1]. The context in [5] is positive
recursion over full first-order logic, but universal quantification isn't used for this
theorem.

To see that the new formula is equivalent to the original, consider the
binary relation obtained as (the interpretation of) $P'$ from the recursion
$P'(x, y) \leftarrow \delta'$. If you fix any particular value $b \in X$ for $y$, then the
resulting unary relation, $P'(x, b)$, is exactly the relation defined by the
original clause, $P(x) \leftarrow \delta$ with $y$ assigned the value $b$.

Q: Yes, that's easy to see if you think of the iterative process leading to
the fixed points that interpret $P$ and $P'$. In the new clause, $P'(x, y) \leftarrow \delta'$,
$y$ behaves simply as a parameter. So, as the binary relation defined
by this clause grows, from $\emptyset$ toward the fixed point $P'$, its unary section
obtained by fixing the second argument as $b$ grows exactly according
to the original clause $P(x) \leftarrow \delta$ with $y$ denoting $b$. In particular, the
agreement between $P$ and a section of $P'$ occurs not only for the final
fixed points but stage by stage during the iteration.

A: That's right. But one can also verify the final agreement directly 
in terms of least fixed points without referring to the iteration. If P' is 
the least fixed point of the new iteration, then each of its sections, say 
at b, is the least fixed point of the old iteration with y denoting b. 

Q: I see that the section is a fixed point of the old operator, simply 
because P' itself is a fixed point of the new one, but why is it the least 
fixed point? 

A: If you had a smaller fixed point $P^-$ for the old operator, then you
could replace the $b$ section of $P'$ with $P^-$ while leaving all the other
sections unchanged. The result would be a smaller fixed point than $P'$
for the new operator. The point here is that we can modify a single
section independently of the others because the new operator works on
each section separately.
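
The section-by-section behavior can be observed on a finite example. In this sketch the clause is the hypothetical $P(x) \leftarrow x = y \vee \exists z\,[P(z) \wedge E(z, x)]$, where $y$ is the extraneous free variable and $E$ is an assumed binary substrate relation; the helper names are ours.

```python
UNIVERSE = range(6)
E = {(0, 1), (1, 2), (3, 4), (4, 5)}  # assumed substrate relation


def lfp(operator):
    """Least fixed point of a monotone operator on finite sets, by iteration."""
    current = frozenset()
    while True:
        following = operator(current)
        if following == current:
            return current
        current = following


def unary_operator(b):
    """Operator of P(x) <- x = y or exists z [P(z) and E(z, x)], with y = b."""
    def apply(P):
        return frozenset(
            x for x in UNIVERSE
            if x == b or any(z in P and (z, x) in E for z in UNIVERSE)
        )
    return apply


def binary_operator(P2):
    """Operator of P'(x, y) <- x = y or exists z [P'(z, y) and E(z, x)]."""
    return frozenset(
        (x, y) for x in UNIVERSE for y in UNIVERSE
        if x == y or any((z, y) in P2 and (z, x) in E for z in UNIVERSE)
    )


P_prime = lfp(binary_operator)
# Section of P' at b: fix the second argument to b.
sections = {b: frozenset(x for (x, y) in P_prime if y == b) for b in UNIVERSE}
# Old recursion, run separately for each value b of the parameter y.
direct = {b: lfp(unary_operator(b)) for b in UNIVERSE}

print(sections == direct)
```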

Q: I see; this is what I expressed earlier by saying that $y$ behaves simply
as a parameter.

OK, so you can get rid of extraneous free variables in $\delta$. But your
unary $P$ has become a binary $P'$.

A: It's still the case that higher arities (or more $P$'s) contribute only
notational complications. So, if you don't mind, we'll revert to the
unary notation and we'll drop the primes on $P$, $\delta$, and $\psi$. In other
words, we'll return to the original notation LET $P(x) \leftarrow \delta$ THEN $\psi$
but with the assumption that only $x$ is free in $\delta$.

Q: OK; I never was a big fan of subscripts. 

A: Good. So let's consider this formula $\varphi$,

LET $P(x) \leftarrow \delta$ THEN $\psi$.

By induction hypothesis, we know that $\delta$ and $\psi$ are already of the
desired form, say $\delta$ is

LET $R(y) \leftarrow \rho$ THEN $\pi$,

and $\psi$ is

LET $S(z) \leftarrow \sigma$ THEN $\theta$,

where $\rho$, $\pi$, $\sigma$, $\theta$ are existential first-order formulas, and their free
variables are among those indicated here:

$\rho(y)$, $\pi(x)$, $\sigma(z)$, $\theta(u)$.

Q: What's this $u$? It wasn't in any of the previous formulas.

A: $u$ represents whatever variables are free in the whole formula $\varphi$.
They were called $u_1, \ldots, u_n$ at the beginning of this half of the proof,
but, as usual, we now pretend, for notational simplicity, that there's
only one such variable.

Q: OK. All your other limitations on free variables are based on the fact
that, by induction hypothesis and by the preceding discussion, none of
the three LET-clauses have extraneous free variables. In particular,
any variable free in $\pi$ would also be free in $\delta$ and therefore can only be
$x$.

A: That's right. Let's write $\varphi$ out in detail, exhibiting not only the
free variables in each part (as above) but also the predicate variables,
$P$, $R$, $S$, that could occur in each part. So $\varphi$ looks like

LET $P(x) \leftarrow$ [LET $R(y) \leftarrow \rho(P, R, y)$ THEN $\pi(P, R, x)$]
THEN [LET $S(z) \leftarrow \sigma(P, S, z)$ THEN $\theta(P, S, u)$].

Q: Please wait a minute while I check your claims about which predicates
can occur where. . . . OK, I agree with what you wrote. The point
is that the predicate variable introduced before a $\leftarrow$ in a LET-clause
is allowed to occur at the right of that $\leftarrow$ in that LET-clause and also
in the associated THEN-clause but not elsewhere.
A: Right. Now we claim that $\varphi$ is equivalent to the following formula
$\varphi'$:

LET $P(x) \leftarrow \pi(P, R, x)$, $R(y) \leftarrow \rho(P, R, y)$, $S(z) \leftarrow \sigma(P, S, z)$

THEN $\theta(P, S, u)$.

Q: Essentially, you've just lumped together all the LET-clauses in $\varphi$,
ignoring the nesting of the clause for R inside the clause for P, and
made one big LET-clause out of all of them. Not very subtle.

A: But it works. The first step toward the proof that it works is setting 
up some notation that is neither cumbersome nor ambiguous. (Either 
problem alone is easily avoided.) We propose the following. 

Fix a structure and a value for the free variable u of $\varphi$. Since they're
fixed, we won't mention them in our notation. To further simplify
the notation, we'll generally use the same symbols for syntactic entities
(like the predicates P, R, S and the variables x, y, z) and possible
semantic interpretations of them in our structure.

Now let's look at our formula $\varphi$ and set up some notation for the
various fixed points that occur in it. We begin with the LET-clause
$R(y) \leftarrow \rho(P, R, y)$ that defines the fixed-point interpretation for R.

Q: Why not start with the first LET-clause, the one for P? 

A: The defining formula $\delta$ in that clause involves the fixed point for
the R clause, so it's useful to settle the R part of the notation first.

For any particular interpretation of the predicate P (and, as indicated
above, we'll use the same symbol P for the interpretation), $\rho$
defines a least fixed point that we'll call $R^\infty(P)$.

Next, the LET-clause for P amounts to using the definition $\pi(P, R, x)$
but with R interpreted as $R^\infty(P)$. (Indeed, that's the semantics of
"LET $R(y) \leftarrow \rho(P, R, y)$ THEN $\pi(P, R, x)$.") Note carefully that the
monotone operator described by this clause,

$P \mapsto \{x : \pi(P, R^\infty(P), x)\}$,

uses its input P twice: first in the first argument of $\pi$ and again via
the dependence on P of the second argument $R^\infty(P)$. We denote the
least fixed point of this operator by $P^\infty$.

Similarly, the LET-clause for S uses the definition $\sigma(P, S, z)$ with
P interpreted as $P^\infty$. We write $S^\infty$ for the least fixed point of this
operator.

Next, we need notation for the three predicates obtained as the least
fixed point of the simultaneous recursion in $\varphi'$. Having already used
superscripts $\infty$ for another purpose, we'll use stars instead for this
triple of fixed points, calling them $P^*$, $R^*$, $S^*$.

In connection with all these fixed points, it is useful to remember
that the least fixed point is also the least closed point. Thus, for
example, $P^*$, $R^*$, $S^*$ can be characterized as the smallest relations that
simultaneously satisfy the implications

(1) $\forall x\,[\pi(P^*, R^*, x) \Rightarrow P^*(x)]$

(2) $\forall y\,[\rho(P^*, R^*, y) \Rightarrow R^*(y)]$

(3) $\forall z\,[\sigma(P^*, S^*, z) \Rightarrow S^*(z)]$

(whereas for fixed points we'd write bi-implications).
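
The remark that the least fixed point is also the least closed point can be checked by brute force on a tiny example. The following Python sketch is only an illustration: the universe, the relation EDGES, and the operator op are hypothetical choices, not anything taken from the proof. It compares the result of iterating a monotone operator from the empty set with the intersection of all sets closed under that operator.

```python
from itertools import combinations

# Hypothetical five-element universe and edge relation (illustration only).
UNIVERSE = frozenset(range(5))
EDGES = {(0, 1), (1, 2), (3, 4)}


def op(p):
    """Monotone operator: 0 belongs, and so does every EDGES-successor of p."""
    return frozenset({0} | {b for (a, b) in EDGES if a in p})


def lfp(f):
    """Iterate f from the empty set until the value stops changing."""
    current = frozenset()
    while True:
        nxt = f(current)
        if nxt == current:
            return current
        current = nxt


def subsets(universe):
    """Enumerate all subsets of a finite universe."""
    items = sorted(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


# A set s is closed when op(s) is included in s (an implication, not an equality).
closed_points = [s for s in subsets(UNIVERSE) if op(s) <= s]
least_closed = frozenset.intersection(*closed_points)
least_fixed = lfp(op)
```

Here both least_closed and least_fixed come out as {0, 1, 2}, even though there are closed sets, such as the whole universe, that are not fixed points.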

The essence of the proof is to show that the fixed points arising from
$\varphi$ and from $\varphi'$ match as follows.

$P^* = P^\infty$, $R^* = R^\infty(P^*)$, and $S^* = S^\infty$.
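
Before the proof, it may help to see the three equations hold on one concrete finite structure. The Python sketch below is an assumption-laden illustration, not part of the argument: the relations E, J, EVEN and the particular positive definitions chosen for pi, rho, and sigma are all hypothetical. It computes the nested fixed points and the simultaneous ones by plain iteration from the empty relations.

```python
# Hypothetical finite structure on {0, ..., 5}, chosen only for illustration.
E = {(0, 1), (1, 2), (2, 3), (4, 5)}
J = {(2, 5)}
EVEN = {0, 2, 4}
EMPTY = frozenset()


def pi(P, R):
    """pi(P, R, x): x = 0, or R(x) and Even(x), or J(w, x) for some w in P."""
    return frozenset({0} | (R & EVEN) | {x for (w, x) in J if w in P})


def rho(P, R):
    """rho(P, R, y): E(w, y) for some w in P or in R."""
    return frozenset(y for (w, y) in E if w in P or w in R)


def sigma(P, S):
    """sigma(P, S, z): P(z), or E(z, w) for some w in S."""
    return frozenset(P | {z for (z, w) in E if w in S})


def lfp(step, bottom):
    """Iterate a monotone step function from bottom until it stabilizes."""
    current = bottom
    while True:
        nxt = step(current)
        if nxt == current:
            return current
        current = nxt


# Nested fixed points, following the nesting of the LET-clauses.
def R_inf(P):
    return lfp(lambda R: rho(P, R), EMPTY)


P_inf = lfp(lambda P: pi(P, R_inf(P)), EMPTY)
S_inf = lfp(lambda S: sigma(P_inf, S), EMPTY)

# Simultaneous fixed point of the single lumped-together LET-clause.
P_star, R_star, S_star = lfp(
    lambda t: (pi(t[0], t[1]), rho(t[0], t[1]), sigma(t[0], t[2])),
    (EMPTY, EMPTY, EMPTY),
)
```

On this structure both computations give P = {0, 2, 5}, R = {1, 2, 3}, and S = {0, 1, 2, 4, 5}. One example proves nothing, of course; it only shows what the equations assert.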



Q: It's clear that, once you establish these equations, you're done.
After all, the truth value of $\varphi$ is, by definition, obtained by evaluating
$\theta$ with P and S interpreted as $P^\infty$ and $S^\infty$, while the truth value of $\varphi'$
is obtained by evaluating the same $\theta$ (in the same structure with the
same value for u, as fixed earlier) with P and S interpreted as $P^*$ and
$S^*$. In fact, this last part of the proof won't need the equation for R,
with its curious mixture of $\infty$ and * on the right side; presumably that
equation is needed as an intermediate step in the proof of the other
two equations.

So please go ahead and prove the $* = \infty$ equations.

A: OK. We'll start by showing that $R^* = R^\infty(P^*)$. Formula (2) says
exactly that $R^*$ is a closed point of the operator $R \mapsto \{y : \rho(P^*, R, y)\}$.
According to the definition of $R^\infty(P)$, applied with P instantiated as
$P^*$, the least closed point of this operator is $R^\infty(P^*)$. So we immediately
have that $R^\infty(P^*) \subseteq R^*$.

Furthermore, all three of the formulas (1), (2), and (3) remain true
if we replace $R^*$ by $R^\infty(P^*)$. For (1), this follows from the fact that
R is a positive predicate symbol (otherwise it couldn't have occurred
on the left of $\leftarrow$ in $\varphi$) and so $\pi$ is monotone with respect to R. When
we replace $R^*$ by $R^\infty(P^*)$, the interpretation of R can only decrease.
That strengthens the antecedent in (1) and thus preserves the truth
of (1). The argument for (2) is easier; the result of replacing $R^*$ by
$R^\infty(P^*)$ there is just the fact that $R^\infty(P^*)$ is by definition closed under
the operator defined by $\rho$ with P interpreted as $P^*$. Finally, the case
of (3) is trivial, as $R^*$ doesn't occur there.

Thus, the triple $P^*$, $R^\infty(P^*)$, $S^*$ is closed under the simultaneous
recursive operator whose least closed point is $P^*$, $R^*$, $S^*$. Therefore
$R^* \subseteq R^\infty(P^*)$, and the $* = \infty$ equation for R is proved.

Let's turn next to the equation for P. In view of what we've already
proved, formula (1) can be written as

$\forall x\,[\pi(P^*, R^\infty(P^*), x) \Rightarrow P^*(x)]$,

which says that $P^*$ is closed under the operator $P \mapsto \{x : \pi(P, R^\infty(P), x)\}$,
whose least closed point is $P^\infty$. So we have $P^\infty \subseteq P^*$.

Furthermore, all three of the formulas (1), (2), and (3) remain true
if we replace $P^*$ by $P^\infty$ and replace $R^*$ by $R^\infty(P^\infty)$. In the case of (1),
this is just the closure condition defining $P^\infty$. In the case of (2), it's the
closure condition defining $R^\infty(P^\infty)$. And in the case of (3), it follows
from the facts that $\sigma$ is monotone with respect to P and $P^\infty \subseteq P^*$.

Thus, the triple $P^\infty$, $R^\infty(P^\infty)$, $S^*$ is closed under the simultaneous
recursive operator whose least closed point is $P^*$, $R^*$, $S^*$. Therefore
$P^* \subseteq P^\infty$, and the $* = \infty$ equation for P is proved.

Finally, we turn to S. In view of the equations already proved,
formula (3) is equivalent to

$\forall z\,[\sigma(P^\infty, S^*, z) \Rightarrow S^*(z)]$,

which says that $S^*$ is closed under the operator whose least closed point
is $S^\infty$. Therefore, $S^\infty \subseteq S^*$.

Furthermore, all three of the formulas (1), (2), and (3) remain true if
we replace $P^*$ by $P^\infty$, $R^*$ by $R^\infty(P^\infty)$, and $S^*$ by $S^\infty$. Since $S^*$ doesn't
occur in (1) and (2), the argument given for them above still applies.
As for (3), the formula we get is just the closure requirement in the
definition of $S^\infty$.

Thus, the triple $P^\infty$, $R^\infty(P^\infty)$, $S^\infty$ is closed under the simultaneous
recursive operator whose least closed point is $P^*$, $R^*$, $S^*$. Therefore,
$S^* \subseteq S^\infty$, and so the proof is complete.

Q: You mean the proof of the $* = \infty$ equations. So you've completed
the inductive proof that every EFPL formula can be put into
the normal form LET $P(x) \leftarrow \delta$ THEN $\psi$, with $\delta$ and $\psi$ in existential
first-order logic. And you have the additional normalization that $\delta$ has
no free variables but x (except that you might really have lots of P's
of arbitrary arities). You still have to convert this into a logic program
of the form used in [6].

A: Right, but the rest is fairly easy. 

First, we can arrange that the formula after "THEN" is an atomic
formula $Q(u)$ where u is a tuple of distinct variables (not compound
terms). Indeed, if we let u list all the variables free in the given formula
LET $P(x) \leftarrow \delta$ THEN $\psi$ (hence free in $\psi$) and if we let Q be a new,
positive predicate symbol of the right arity, then this given formula is
easily equivalent to

LET $P(x) \leftarrow \delta$, $Q(u) \leftarrow \psi$ THEN $Q(u)$.

Q: I see: Since Q doesn't occur in $\psi$, the "recursive" definition $Q(u) \leftarrow \psi$
isn't really recursive. The relevant iteration takes just one step (once
P has reached its fixed point) and, in effect, makes $Q(u)$ an alias for
$\psi$.

A: Right. Furthermore, as Q doesn't occur in $\delta$, nothing has changed
in the recursive definition of P.
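
The alias behaviour of Q can be seen in a small simulation. In this Python sketch, the relation EDGES and the particular choices of delta (for P) and psi (for Q) are hypothetical, picked only to have something to iterate. It runs the simultaneous iteration for the pair (P, Q) and compares the result with computing P alone and then evaluating psi once.

```python
# Hypothetical structure on {0, ..., 4} (illustration only).
EDGES = {(0, 1), (1, 2), (3, 4)}
EMPTY = frozenset()


def delta(P):
    """delta(P, x): x = 0, or EDGES(w, x) for some w in P."""
    return frozenset({0} | {x for (w, x) in EDGES if w in P})


def psi(P):
    """psi(P, u): P(u) and u is even. Q does not occur here."""
    return frozenset(u for u in P if u % 2 == 0)


def lfp(step, bottom):
    """Iterate a monotone step function from bottom until it stabilizes."""
    current = bottom
    while True:
        nxt = step(current)
        if nxt == current:
            return current
        current = nxt


# P on its own, as in LET P(x) <- delta THEN psi.
P_inf = lfp(delta, EMPTY)

# P and Q together, as in LET P(x) <- delta, Q(u) <- psi THEN Q(u).
P_star, Q_star = lfp(lambda t: (delta(t[0]), psi(t[0])), (EMPTY, EMPTY))
```

The P component is unaffected by the extra clause, and Q_star coincides with psi evaluated at the fixed point of P.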

So our formula has been equivalently rewritten as

LET $P_1(\bar{x}_1) \leftarrow \delta_1, \dots, P_k(\bar{x}_k) \leftarrow \delta_k$ THEN $P_k(u)$.

Q: You must really be getting near the end of the proof, since you've 
restored the multiple P's and their subscripts. 

A: Yes. What remains is to transform each of the existential first-order
formulas $\delta_i$ as follows.

We can put each $\delta_i$ into prenex form and then put its quantifier-free
matrix into disjunctive normal form. So $\delta_i$ now looks like

$\exists \bar{y}\, \bigvee_r \bigwedge_s \alpha_{irs}$

where the $\alpha$'s are atomic or negated atomic formulas.
Then the required logic program $\Pi$ consists of the rules

$P_i(\bar{x}_i) \leftarrow \bigwedge_s \alpha_{irs}$,

one for each i and r.
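
The passage from disjunctive normal form to rules is mechanical, and a few lines of Python can sketch it. Everything concrete here is an assumption made for illustration: the dictionary layout, the function name dnf_to_rules, the textual rule syntax with "<-", and the sample definitions. The sketch also assumes that the leading existential quantifiers are simply dropped, with body variables read existentially.

```python
def dnf_to_rules(definitions):
    """Emit one rule per head and per disjunct.

    definitions maps a head atom (a string) to the list of disjuncts of
    its defining formula; each disjunct is a list of literals (strings).
    """
    rules = []
    for head, disjuncts in definitions.items():
        for conjunct in disjuncts:
            rules.append(f"{head} <- {', '.join(conjunct)}")
    return rules


# Hypothetical definitions, already in prenex disjunctive normal form
# with the existential prefix omitted.
definitions = {
    "P(x)": [["x = 0"], ["R(y)", "E(y, x)"]],
    "R(y)": [["P(w)", "E(w, y)"], ["R(w)", "E(w, y)"]],
}

rules = dnf_to_rules(definitions)
for rule in rules:
    print(rule)
```

Two heads with two disjuncts each yield four rules, matching the count "one for each i and r".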

Q: This is very similar to what happened in the first part of the proof,
translating logic programs into EFPL formulas. The monotone operators
defined by the transformed EFPL formula and by the logic
program are identical as operators. So they certainly have the same
least fixed point. In particular, the $P_k$ component of that least fixed
point, which is one of the superstrate relations defined by the program,
is the interpretation, in the substrate, of the original EFPL formula.

Remark 3. Although, as mentioned in a footnote earlier, [6] is directed
toward applications and [1] is more theoretical, EFPL does have one
advantage over liberal Datalog from the programming point of view. In
a liberal Datalog program, all variables are global, but in an EFPL
formula, the LET-THEN construction provides local variables with
scopes. The latter can be important for large-scale programming, by
making it easy to assemble small modules into a large program (or
formula).